Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guarantee column order of objects. With headers. #2

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zevero
Copy link

@zevero zevero commented May 24, 2016

Order is never guaranteed in Objects, so I wonder that it worked for you up until now...

@jczaplew
Copy link
Owner

It has always worked because v8 is very consistent with Object key ordering, with the exception of integers as keys - http://stackoverflow.com/questions/280713/elements-order-in-a-for-in-loop/280861#280861

I made some edits on a different branch that should guarantee a consistent order. Could you please take a look and let me know if it meets your needs? https://github.com/jczaplew/csv-express/tree/ordered

Thank you for your contribution!

@zevero
Copy link
Author

zevero commented May 25, 2016

oh ... I did not mean Object key ordering, which is consistent in most browsers and v8 (albeit not in spec). And yes in my code I could have avoided to have both csvHeader_arr and csvHeader_obj if I had chosen to rely on it ;)

My pull request addresses another issue:
My array of object comes from a mongodb and key order is dependent on the order they were saved. Some objects have missing keys, some have more keys.

So if you have a look in my code you can see, that I do not simply take the keys from the first object in the array, but I collect all keys in all of the objects in the order they come and give them a fixed position, which ensures that the same key (values) are in the same column (with correct key in header).

@amadeus-torwell
Copy link

+1

I see a critical issue when you have inputs like this:

Works: [{a: 1, b: 2}, {a: 3, b: 4}]

Column 'b' will be missing: [{a: 1}, {a: 3, b: 4}]

So instead of relying on the first object, we would have to create a full table index based on all inputs. What do you think?

@zevero
Copy link
Author

zevero commented May 22, 2018

This is exactly why I have opened this pull request 2 years ago ... yes I am creating a full table index.

@jczaplew
Copy link
Owner

@deusama and @zevero - I see the issue, but it seems that creating a full table index could result in a large performance penalty, both in terms of parsing speed and potentially memory usage.

One way this could be addressed is by modifying the package to accept a list of headers, i.e. declaring the object order a priori. It would look something like this:

let headers = ['a', 'b']
let data = [ { 'a': 1}, {'b': 2, 'a': 3} ]

let body = ''
// Loop through the data
for (let i = 0; i < data.length; i++) {
  // Iterate on the known headers
  for (let header = 0; header < headers.length; header++) {
    // For each data object, check if it contains the known key
    if (data[i][headers[header]] {
       body += data[i]
    }
  }
}

I'd still like to benchmark this, but it removes the necessity of looping through all of the data twice, once to find all headers and again to parse the data.

Thoughts?

@amadeus-torwell
Copy link

amadeus-torwell commented May 22, 2018

Thanks for the quick reply @jczaplew!

Could be done this way, or using a configuration flag having an option for 'first-row-is-head' sort of thing. So the header could be within the data: [{a, b}, {a: 1}, {a: 3, b: 4}]

And a second thought - yes I like the idea of having a configuration here. So we can specify the data we want to see in the csv outputs.

@zevero
Copy link
Author

zevero commented May 22, 2018

Have a look at my Version of the code https://github.com/zevero/csv-express/blob/b60518d748f2d7df41fcc374c5f29bd308d8089a/lib/csv-express.js

There is no performance penality. As data is parsed there is just a check if the key is known. If not it is simply added.

Of course the header could be predefined. But I would rather not like to bother. For me export-csv should JUST export whatever data there is into csv format. For me, it was just important, that the values, don't end up under the wrong header.

@amadeus-torwell
Copy link

Sorry to ask, but do we have progress? :-)

Thanks,
Amadeus

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants